Skip to content

Conversation

@tarruda
Copy link

@tarruda tarruda commented Aug 6, 2025

TBH I'm not sure if this is the correct fix, but it seems to work fine locally if I also patch the jinja2 template (https://huggingface.co/ggml-org/gpt-oss-120b-GGUF) with this:

--- gpt-oss.j2	2025-08-06 10:59:18
+++ gpt-oss.j2  2025-08-06 11:00:43
@@ -303,5 +303,5 @@

 {#- Generation prompt #}
 {%- if add_generation_prompt -%}
-<|start|>assistant
+<|start|>assistant<|channel|>analysis
 {%- endif -%}

This might be a partial fix for #15110

@tarruda tarruda force-pushed the fix-gpt-oss-reasoning-parse branch from 606ea13 to 896f2e6 Compare August 6, 2025 14:17
This also requires patching the Jinja2 template:

```diff
--- gpt-oss.j2.old	2025-08-06 10:59:18
+++ gpt-oss.j2.new	2025-08-06 11:00:43
@@ -303,5 +303,5 @@

 {#- Generation prompt #}
 {%- if add_generation_prompt -%}
-<|start|>assistant
+<|start|>assistant<|channel|>analysis
 {%- endif -%}
```
@tarruda tarruda force-pushed the fix-gpt-oss-reasoning-parse branch from 896f2e6 to 34786fe Compare August 6, 2025 14:18
@tarruda tarruda changed the title Fix "<|channel|>analysis" appearing when chatting via llama-server API Fix "<|channel|>analysis" appearing when chatting via llama-server API (#15110 ?) Aug 6, 2025
@tarruda tarruda changed the title Fix "<|channel|>analysis" appearing when chatting via llama-server API (#15110 ?) Fix "<|channel|>analysis" appearing when chatting via llama-server API Aug 6, 2025
@nleve
Copy link

nleve commented Aug 6, 2025

I had the same issue (using https://huggingface.co/unsloth/gpt-oss-120b-GGUF with --jinja) and can confirm this fixed it for me.

@ngxson
Copy link
Collaborator

ngxson commented Aug 6, 2025

I would prefer a fix without changing jinja template. I think we should simply extend the builder class to accept a sequence of strings instead of a single string, something like this:

builder.try_parse_reasoning({"<|channel|>", "analysis", "<|message|>"}, ...

One of the goals of having a jinja template is to provide a ground truth template cross runtimes. Adding <|channel|>analysis as you proposed will break the compatibility for certain use cases where application expect to see it in the response stream.

@tarruda
Copy link
Author

tarruda commented Aug 6, 2025

@ngxson makes sense, closing this one then!

@tarruda tarruda closed this Aug 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants